The Role of Cognates in Word Acquisition
PhD Defence / Departament de Medicina i Ciències de la Vida
2024-11-03
Average 20-year-old knows ~42,000 lemmas: mental lexicon
First lexical representations at 6-9 months
Figure 1: Vocabulary size norms for 51,800 monolingual children learning 35 distinct languages (wordbank)
Hoff et al. (2012): bilinguals acquire words at rates similar to monolinguals
47 English-Spanish bilinguals, 56 English monolinguals in Florida
Floccia et al. (2018): CDI response of 372 bilinguals (UK) learning English + additional language
English-Dutch (22.14%) > English-Mandarin (1.97%)
Higher lexical similarity, larger vocabulary size
Stronger effect in the additional language (e.g., Dutch, Mandarin)
Figure 2: Pairwise lexical similarity (average Levenshtein similarity across translations).
Cognates: phonologically-similar translation equivalents
| Cognate | Non-cognate |
|---|---|
| [cat] /ˈgat-ˈga.to/ | [dog] /ˈgos-ˈpe.ro/ |
Some evidence that cognates are acquired earlier than non-cognates (Mitchell, Tsui, and Byers-Heinlein 2023; Bosch and Ramon-Casas 2014)
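
A minimal sketch of how phonological similarity between translation equivalents could be scored. It assumes a Levenshtein distance normalised by the length of the longer form, and it uses simplified transcriptions of the example pairs (stress marks and syllable boundaries dropped); neither choice is confirmed as the exact procedure behind these slides, and the function names are hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, symbol_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, symbol_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (symbol_a != symbol_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; assumed normalisation: 1 - distance / longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


# Simplified transcriptions of the Catalan-Spanish pairs in the table above
print(levenshtein_similarity("gat", "gato"))  # cognate pair: 0.75
print(levenshtein_similarity("gos", "pero"))  # non-cognate pair: 0.0
```

Under this scoring, the cognate pair differs by a single symbol, whereas the non-cognate pair shares no aligned symbols.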
What mechanisms support a cognate facilitation during word acquisition?
Activation spreads across non-selected representations in both languages, through phonological and conceptual links (e.g., Costa, Caramazza, and Sebastian-Galles 2000).
Evidence in children (Bosma and Nota 2020; De Houwer, Bornstein, and Putnick 2014) and infants (Von Holzen and Mani 2012; Jardak and Byers-Heinlein 2019; Singh 2014).
Study 1
Study 2
Cognate beginnings to lexical acquisition: the AMBLA model
Exposure to a word-form that results in the accumulation of information about its meaning
\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{grey}{RGB}{128, 128, 128} \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ {\color{mygreen}\text{Age of Acquisition}_{ij}} &= \{\text{Age}_i \mid {\color{myred}\text{Learning instances}_{ij}} = {\color{myblue}\text{Threshold}} \}\\ {\color{myred}\text{Learning instances}_{ij}} &= \text{Age}_i \cdot \text{Freq}_j \\ \textbf{where:} \\ {\color{myblue}\text{Threshold}} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \end{aligned} \]
Catalan 60%, Spanish 40%
Exposure: proportion of time exposed to the language of word-form \(j\)
Accumulation of learning instances, a function of Exposure and Frequency.
\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{myorange}{RGB}{ 235, 127, 26 } \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ \text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\ \text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot {\color{myred}\text{Exposure}_{ij}}\\ \textbf{where:} \\ \text{Threshold} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \end{aligned} \]
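
A rough worked example: solving Learning instances = Threshold for Age, and assuming that Age is in months, that Freq is fixed at its expected value of 50, and that exposure follows the 60%/40% Catalan/Spanish split above (the time unit is an assumption, not stated in the model):

\[ \begin{aligned} \text{Age of Acquisition}_{ij} &= \frac{\text{Threshold}}{\text{Freq}_j \cdot \text{Exposure}_{ij}} \\ \text{Catalan word-form:} \quad \frac{300}{50 \cdot 0.6} &= 10 \\ \text{Spanish word-form:} \quad \frac{300}{50 \cdot 0.4} &= 15 \end{aligned} \]

With equal frequency, the word-form in the lower-exposure language reaches the threshold later.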
Degree proportional to their phonological similarity (Cognateness)
\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{myorange}{RGB}{ 235, 127, 26 } \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ \text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\ \text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot \text{Exposure}_{ij} + \\ &({\color{myred}\text{Learning instances}_{ij'} \cdot {\text{Cognateness}}_{j}})\\ \textbf{where:} \\ \text{Threshold} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \\ {\color{myred}\text{Cognateness}}&{\color{myred} = \text{Levenshtein}(j, j')} \end{aligned} \]
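
A minimal sketch of the cognateness term, under stated assumptions: Age is in months, Freq is fixed at its expected value of 50 for both translations, and the instances borrowed from the translation \(j'\) are only those driven by its own exposure (an assumption made here to avoid a circular definition). Function and parameter names are hypothetical, not the authors' implementation.

```python
import math

THRESHOLD = 300.0  # learning instances required for acquisition (from the model)


def age_of_acquisition(freq, exposure, freq_translation,
                       exposure_translation, cognateness):
    """Age at which accumulated learning instances reach THRESHOLD.

    Learning instances grow linearly with age, so the age of acquisition is
    THRESHOLD divided by the instances gained per unit of age.
    """
    own_rate = freq * exposure
    # Assumption: the translation contributes its own exposure-driven
    # instances, weighted by cognateness (0 = non-cognate, 1 = identical).
    borrowed_rate = freq_translation * exposure_translation * cognateness
    rate = own_rate + borrowed_rate
    return THRESHOLD / rate if rate > 0 else math.inf


# 60% Catalan / 40% Spanish exposure, as in the example above
for label, cognateness in [("non-cognate", 0.0), ("cognate", 0.75)]:
    catalan = age_of_acquisition(50, 0.6, 50, 0.4, cognateness)
    spanish = age_of_acquisition(50, 0.4, 50, 0.6, cognateness)
    print(f"{label}: Catalan {catalan:.2f}, Spanish {spanish:.2f}")
```

With these illustrative values, the cognate advantage is larger for the lower-exposure language (Spanish: about 15 to 7.1) than for the higher-exposure language (Catalan: about 10 to 6.7), because the lower-exposure word-form borrows from a translation that accumulates instances faster.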
p(Comprehension \(<\) Production) ~ Ordinal, multilevel (Bayesian) regression model
\[ \begin{aligned} \text{Exposure}_{ij} &= \text{Frequency}_j \times \text{Language degree of exposure}_{ij} \\ \text{Cognateness}_{j} &= \text{Levenshtein}(j, j') \end{aligned} \]
Figure 3: Marginal posterior predictions
Earlier acquisition for cognates vs. non-cognates
Cognate facilitation moderated by exposure
Only words from the lower-exposure language benefit from cognateness
Parallel to language dominance effects in adults?
Cognateness as a candidate mechanism underlying the results of Floccia et al. (2018)
Cross-language facilitation via co-activation of phonologically similar translation equivalents
Is language non-selectivity already present?
Developmental trajectories of bilingual spoken word recognition
Under language non-selectivity: stronger interference in cognate vs. non-cognate trials
Replication study
N = 112 children (15 longitudinal)
Aged 26.36 months (SD = 4.01, Range = 20.03–32.5)
English monolinguals, Oxford (United Kingdom) (as in Mani and Plunkett 2010)
Proportion of target looking (PTLT) ~ (Bayesian) GAMMs
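
A minimal sketch of how a target-looking proportion could be computed per time bin before modelling. It assumes the common definition target / (target + distractor), with gaze samples outside both pictures excluded; the sample format, bin width, and function name are hypothetical and not taken from the study's pipeline.

```python
from collections import defaultdict


def ptlt_by_bin(samples, bin_ms=100):
    """Proportion of target looking per time bin.

    `samples` is an iterable of (time_ms, aoi) pairs, where aoi is
    "target", "distractor", or anything else (ignored).
    Returns {bin_start_ms: target / (target + distractor)}.
    """
    counts = defaultdict(lambda: {"target": 0, "distractor": 0})
    for time_ms, aoi in samples:
        if aoi in ("target", "distractor"):
            bin_start = (time_ms // bin_ms) * bin_ms
            counts[bin_start][aoi] += 1
    return {
        bin_start: c["target"] / (c["target"] + c["distractor"])
        for bin_start, c in sorted(counts.items())
    }


# Hypothetical gaze samples from one trial
trial = [(0, "target"), (20, "distractor"), (40, "target"),
         (60, "other"), (100, "distractor"), (120, "distractor")]
print(ptlt_by_bin(trial))
```

Bins with no valid samples are simply absent from the result, so they are not confused with bins in which the child looked only at the distractor.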
Figure 4: Time course of target fixations in Experiment 1.
N = 162 children (81 longitudinal)
Exposed to Catalan or Spanish (Metropolitan Area of Barcelona, Spain)
Aged 25.36 months (SD = 4.01, Range = 20.03–32.5)
Figure 5: Time course of target fixations in Experiment 1.
Figure 6: Time course of target fixations in Experiment 1.
Successful word recognition across:
No evidence of priming effects, within or across languages
Most likely due to design issues
Cognateness facilitates word acquisition in the lower-exposure language
Candidate mechanism behind bilingual vocabulary growth
AMBLA: Cross-language accumulation of learning instances
Language non-selectivity in the initial lexicon: pending severe testing
Explanation for Floccia et al. (2018)
Asymmetry in adult models of lexical processing
AMBLA: natural extension of the Standard Model of language acquisition? (Kachergis, Marchman, and Frank 2022)
Design caveats
Generalisability? Language pairs with fewer cognates
Does cognateness impact the acquisition of other grammatical categories (e.g., verbs, adjectives)?
Word acquisition vs. word learning
Thanks
Figure 7: Aggregated vocabularies might conceal facilitation effects
Figure 8: Participant receptive vocabulary sizes across ages and language profiles.
Figure 9: Participant receptive vocabulary sizes across ages and language profiles.